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An important function of the Bell Laboratories Quality Assurance 
Center and the Western Electric Quality Assurance Directorate is to 
audit the quality of the products manufactured and the services 
provided by the Western Electric Company to determine if the in- 
tended quality standards are met. Until the sixth period of 1980, the 
t-rate system was used to make inference on the product quality. 
Starting the seventh period of 1980, the Quality Measurement Plan 
(qmp) has been implemented. The qmp is based on an empirical Bayes 
model of the audit- sampling process using the current and the pre- 
ceding five periods of data. Because it ignores the time order of the 
data, it is slow in responding to drifts in the process mean. The 
Quality Evaluation Plan (qep) has been designed to take into account 
the time order of the data and to be more sensitive to drifts in the 
process mean. In this paper we present the Quality Evaluation Plan, 
which uses the entire time series of data on a given product to 
determine if that product meets the quality standard. The time series 
is modeled by a stochastic process, which allows for the possibility 
that the process mean may drift or fluctuate around a fixed value. An 
adaptive Kalman filtering theory is developed for filtering out the 
sampling variance and obtaining the best estimate of the true defect 
index and its confidence interval. Thus, in qep the best estimate of 
the true defect index is obtained by a combination of adaptive 
exponential smoothing and shrinkage to the mean. The qep compu- 
tations are recursive, and the total computing efforts of qep and qmp 
are roughly equal. The paper contains several examples to illustrate 
the qep. 

I. INTRODUCTION 

An important function of the Bell Laboratories Quality Assurance 
Center and the Western Electric Quality Assurance Directorate is to 
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audit the quality of the products manufactured, and the services 
provided by the Western Electric Company to determine if the in- 
tended quality standards are met. This is achieved by dividing the 
products and services into some 3000 homogeneous classes. A small 
sample is taken from each class during each period (there are eight 
rating periods in a year). Based on this data, an inference is made in 
each period regarding the compliance of each class to the quality 
standard. 

Until the sixth period of 1980, the t-rate system, evolved from the 
work of Dodge and others, 1 was used to rate the product quality. 
Starting with the seventh period of 1980, the Quality Measurement 
Plan (qmp) was implemented. The qmp, developed by A. B. Hoadley, 2 
is based on an empirical Bayes model of the audit-sampling process. It 
uses the current and the preceding five periods of data. It represents 
a considerable improvement in the statistical power for detecting 
substandard quality as compared with the old rules based on the t- 
rate. However, qmp ignores the time order of the observations, so it is 
less sensitive to drifts in the process mean. The Quality Evaluation 
Plan (qep) has been designed to take into account the time order of 
the data and to be more sensitive to drifts in the process mean. 

The object of this paper is to present the Quality Evaluation Plan, 
which uses the entire time series of data on a given class to determine 
if that class meets the quality standard. The time series is modeled by 
a stochastic process, which allows for the possibility of the process 
mean to (i) drift or (ii) fluctuate around a fixed value. An adaptive 
Kalman filtering theory is developed for filtering out the sampling 
variance and obtaining the best estimate of the true defect index and 
its confidence interval. Some of the salient features of qep are: (i) the 
best estimate of the defect index is obtained by an adaptive exponential 
smoothing process, making qep more responsive to shifts and drifts in 
the process mean; (ii) the qmp model is a special case of the general 
model proposed here; and (Hi) the computational method is recursive. 

This paper is divided into nine sections. We describe the model in 
Section II. Section III gives the Kalman filter solution of the model. 
Adaptive estimates of the model parameters are developed in Section 
IV, and in Section V we modify the Kalman filter solution of Section 
III to reflect the fact that the model parameters are estimated. The 
solution thus obtained is the adaptive Kalman filter. The construction 
of the box chart for displaying the results, and the rules for the 
exception report are spelled out in Section VI. The selection of the 
starting values for the estimation of the model parameters and the 
adaptive Kalman filter solution is discussed in Section VII. The 
algorithm has been tried on a number of rating classes and also on 
simulated data. We present representative examples in Section VIII. 

2082 THE BELL SYSTEM TECHNICAL JOURNAL, OCTOBER 1 982 



Some numerical comparison between qep and qmp is also presented in 
that section. Finally, in Section IX, we discuss the features of the qep 
and other potential applications of the adaptive Kalman filtering 
methodology developed in this paper. A summary of the qep formulae 
is given in Appendix C. 

Parts of the derivation of qep are heuristic. The heuristic has a 
sound theoretical foundation under two assumptions: (i) the audit 
sample size for a rating class does not vary in time by orders of 
magnitude, and («') the maximum likelihood estimates of the time 
series parameters fall within their feasible region. These assumptions 
are satisfied in about 95 percent of the audit examples, qep appears to 
work for the other 5 percent as well, but this has not been fully tested. 

II. DESCRIPTION OF THE MODEL 

Let t denote the true defect index in period t for the particular 
rating class under study. Thus, 

(Total number of defects present in 
the production of period t 

o> = -, : r- 

/ Total number of defects allowed in \ 

\ that production under the quality standard) 

In deriving the present qmp, Bruce Hoadley 2 assumed that over a time 
window of six periods the successive values of 0, are independently and 
identically distributed around a fixed mean, called the long-term 
process mean. Consequently, the time order of the past observations 
is ignored in estimating the current defect index. Hence, qmp responds 
to a drift in the defect index only through having the moving window, 
which means a slow response. In our model we will overcome this 
deficiency by explicitly allowing for drift and serial correlation. 

The mathematical analysis of serially correlated data is greatly 
simplified when the random variables involved are normally distrib- 
uted. The audit problem can be put in this framework by the square 
root transformation described in the following paragraph. 

For the chosen sample, let e t be the expected number of defects 
under standard quality, x t be the observed number of defects, and 
/, = xi/e, be the observed defect index; then x, has a Poisson distribu- 
tion with mean e t 6,. It is well known that the distribution of sxt can be 
approximated by the Gaussian density with mean veift and variance 
'/4. Let Y t = v7/. The distribution of Y, is approximately normal with 
mean V& and variance 0.25/e,. We shall denote \/#, by ft and refer to 
it as the transformed true defect index, or simply as the true defect 
index. 
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When the observations are defined in terms of demerits or defectives 
we will take x t and e t equal to the observed and the expected equivalent 
defects, respectively, as defined by Hoadley. 3 In this case the distri- 
bution of x t is approximately Poisson with mean 6 t e t , so we can still 
use the square root transform defined in the previous paragraph. 

Autoregressive-integrated-moving average models with appropriate 
trend terms may be used to characterize a wide range of serial corre- 
lations and trends. However, since the available data on each product 
is limited, it is essential that we keep the structure simple, involving 
only a few parameters. Thus, we propose the following model for the 
variation of £,: 

St ■ rn t + vu, (1) 

where mt is the trend term (including the mean) and v\t is the deviation 
from the trend. The successive values of Vu will be assumed to be 
independently distributed with zero mean and variance a\ t . 

Since the exact nature of the drift is not known to begin with, we 
shall assume that m t is a random walk. We found in control engineering 
literature 4 that the random walk model serves well in tracking a variety 
of trends in unknown parameters, and therefore we chose to use it in 
the present problem. Thus, 

m t = mt-i + v 2t , (2) 

where v%t is a sequence of independently distributed random variables 
with mean zero and variance a\ t - Further, the sequence V2t will be 
assumed to be independent of the sequence v u . 

Equations (1) and (2) thus characterize the variation of the defect 
index — the component m t describes the low-frequency (smooth) 
changes, while vu describes the high-frequency changes in &. If we 
take alt = and a u = a? = constant, then these equations imply that 
the &'s and hence t 's are independently and identically distributed. 
Thus, the qmp model is a special case of the general model of this 
paper. 

The transformed observed defect index, Y t , is the transformed true 
defect index plus the sampling error, rjt. Thus, 

Yt=$t + T)t. (3) 

As discussed earlier, the expected value of Y t is £* and the variance of 
Y t is 0.25/e,, so -q c has zero mean and its variance is equal to 0.25/ 'e t . 
We assume that the successive random variables tj* are independent. 
Also, since the origins of r\ t , v\ t and V2t are unrelated, we assume that 
these three series are mutually uncrosscorrelated. Further, the distri- 
butions of vu, V2t, and tj, are assumed to be normal. The justification 
for this assumption comes from the fact that Yt's are approximately 
normally distributed. 
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The problem at hand is to make an inference on 6 n given data up to 
and including the nth time period. In particular, we wish to determine 
the posterior probability of the event that 6 n exceeds one. 

III. KALMAN FILTER SOLUTION 

In the Kalman filter terminology, ft, and m„ are the unobserved state 
variables about which we wish to make inference using the observa- 
tions Yi, '•' , Y n . Let us, for now, assume that the model parameters 
a\ t , alt, and a% are known for t = 1, * • • n and that the means and the 
variances of ft> and rm are known. Then the Kalman filter provides 
recursive formulae for estimating the posterior means and variances of 
ft, and m n . The derivation of the general Kalman filter may be found 
in a number of books (e.g., see Jazwinsky 5 or Gelb 4 ). A simple deriva- 
tion for the special case of the audit model is given in Appendix A. 
The desired recursive formulae are given below. 

Conditional on the data up to time t, the distribution of mt is normal 
with mean m t and variance q t , 

i.e., m t \t~ N(m t , q t ), 

where m t = W2tm t -i + (1 — <C2t)Y t (4) 

q t =a-o> 2t )(ou + o$t) (5) 

°u + o nt . 

alt + a* t + ait + qt-i 

Likewise, conditional on the data up to time t, the distribution of & 
is normal with mean ft and variance p t , 

i.e., ft ~ N(ft, p t ), 
where ft = 032t<^un\t-i + (1 — U2t*>>u)Yt (7) 

p t = (1 - b32tU\t)a% (8) 

(02f(0u = — 2 — < 2 / 2 , • (9) 

a^t + ait + ait + qt-\ 

To use these recursive equations the starting values mo and qo must 
be specified. The choice of these values is discussed in Section VII. 
For now, we note that as t -» oo, the effect of the starting values 
reduces to zero. 

Notice that eq. (4) is an adaptive, exponential smoothing equation. 
The smoothing constant, co 2t , is a function of time and is determined 
by the relative values of the different variances as given by eq. (6). 
Observe that V(Y t \m t ) = ah + a% so that ou + a% measures the 
uncertainty in using Y t for estimating m t ; also, alt = V(m t \mt-i) and 
q,_ x = V(m t -i\t - 1). Thus, alt + qt-i is a measure of uncertainty in 
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using m t -i for estimating mt. It is clear from eqs. (4) and (6) that the 
weights given to Y t and m t -\ are inversely proportional to their respec- 
tive uncertainties in estimating m t . 

Equations (7) and (9) are the analogous equations for computing 
the posterior mean of &. Note that V(Y t \&) = a% is the uncertainty 
in using Yt to estimate fr. Further, V(&\m t -i) = ah + alt and 
V(m t -\ 1 1 — 1) = q t -\ so that a\ t + alt + Qt-i is the uncertainty in using 
m t -i to estimate &. The weights on Y t and m t -\ are thus seen to be 
inversely proportional to the respective uncertainties. 

From eq. (8) we note that the posterior variance of & conditional on 
data up to time t is smaller than a\ t by the factor (1 — «2tWit). Thus, 
the factor (1 — o)2tUu) represents the advantage of filtering in estimat- 
ing &. Similarly, from eq. (5) we see that the factor (1 — u 2l ) is the 
benefit of filtering in estimating m t . 

To compare the qmp model given in Ref. 2 with the qep model we 
shall rewrite eqs. (7) and (9) as follows: 

It = 03 U m t + (1 - ion) Yt 

w " - .2 7 Ji * 



a v t + 0\ t 

Analogous to the qmp, eq. (7) expresses It as a weighted sum of m t , the 
estimated current mean level, and Y t , the current observation. The 
weight ecu is analogous to the shrinkage constant given in Ref. 2, and 
we will also call it a shrinkage constant. 

The discussion of this section was based on the assumption that a\ t , 
alt, and a% are known quantities. However, in the audit problem a\ t 
and alt are not known and must be estimated from the observed data. 
In the following section we will derive the estimates of a\ t and alt, and 
in Section V we will modify eqs. (4) through (9) to accommodate the 
fact that a\ t and alt are estimated. 

IV. ESTIMATION OF THE MODEL PARAMETERS 

Consider the case where o\t = a\ = constant, alt = al = constant, 
and a\ t = a\ = constant. 

Let us define Z t = Y t — Yt-i. Under the assumed model E(Z t ) = 
and the autocovariances of Z t are given by 

£(Z?) = 2a? + <x! + 2aS, (10) 

E(ZtZ t -i) = -a\ - a\, (11) 

and 

E(Z,Z t -i) = 0, Z>2. (12) 
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Thus, Z t is a first-order moving average [ma(1)] process of zero mean, 
i.e., Z t can be represented as 

Zt-at + flat-u (13) 

where a t is a white noise series of variance a 2 . The parameters ($ and 
a 2 are related to a?, a 2 , and a 2 , through the autocovariance function; 
i.e., 

E(Z 2 ) = (1 + /?V = 2a? + a\ + 2a 2 , (14) 

and 

E(Z t Z t -i) = (So 2 - -a? - a 2 . (15) 

Solving eqs. (14) and (15) we have 

a? = -0a 2 - a 2 (16) 



and 



a! = (1 + /?)V 



(17) 



The nonnegativity of a 2 and the invertibility of the model given by eq. 

(13) impose the following restriction on /?: -1< /? < -a 2 /a 2 . Thus, the 
feasible region for /? and a 2 is the one enclosed by the lines: (3 = -1, 
p a 2 = -a 2 ,, and a 2 = oo. The region is shown in Fig. 1. 

The parameters /? and a 2 can be estimated using a suitable time- 
series method. Once p and a 2 are known, eqs. (16) and (17) may be 
used to estimate a 2 and ai. 

Contrary to the assumption made at the beginning of this section, 
a\ t , alt, and a 2 , may in fact vary from period to period. Through eqs. 

(14) and (15) this implies that ft and a 2 may vary with time. In other 
words, Zt is like a MA(1) process with changing parameters. Therefore, 




Fig. 1— Feasible region for /? and a 2 . 
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we need an adaptive, recursive estimation method for estimating /? and 
a 2 , rather than the usual time series estimation methods. The recursive 
method of Phadke 6 will therefore be used here. The method discounts 
the past data exponentially and thus can respond to changes in the 
model parameters. The necessary recursion formulae are given below. 
Appendix B gives the derivation of these formulae. 

p l = /3o-R7 1 Pt (18) 

a? = Stifi t )/A t , (19) 

where 

ddt 
„, = Ai',-i + 2a,— (20) 

dp 

*-* fl " + 2 ($)' <21) 

a t = Z t - Poa,-i (22) 

da./ da.t-1 

__ = _ a ,_ 1 _ /?0 __ (23) 

S,(Po) = XS,_,(/? ) + a 2 (24) 

S t (fi,) = S,(fo) + (fit - fa)Pt + 'Mfit - Po?Rt (25) 

A, = \A t -! + 1 . (26) 

The choice of the starting values for these recursions will be studied 
in Section VII. The parameter X, < X < 1, determines how fast the 
old data is discounted in estimating the model parameters. X = 1 
implies that the entire past data is used. The smaller the X the faster 
the past data is discounted. 

The estimates of and p t are uncorrelated and have the following 
approximate variances: 

V(fi t ) <* 2o 2 /R t (27) 

V(o?) * 26V At. (28) 

The estimated values of a 2 and /? may be substituted in eqs. 
(16) and (17) to compute a\ t and ah. When a% varies with time, we 
should use the exponentially smoothed value, al t , as defined below: 



ait = X<,_i + (1 - XK„ (29) 

in place of a% in eq. (16). Thus, 

a 2 — a a 2 ^2~ 
<nt — —ptOt — a^ t , 
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and 

61 = (1 + fitful 

In the rare case of extreme variations (order of magnitude variations) 
in a 2 t , eq. (29) has not been fully tested, so we urge caution for such 
cases. 

It is possible that eqs. (18) and (19) would yield infeasible values of 
fit and of . In that case we propose the following truncation rules, which 
take us close to the maximum point on the feasible boundary. 

Step 1: Truncate fit to the region [-1, 0]. Denote the truncated 
value by fit, where 

(fio - v,/Rt if -1 < fio - vt/Rt < 
fit = \ if fio - vt/Rt > . 

L-l if fio- vt/Rt < -1 

Step 2: Compute at 2 and h\ t \ 

at 2 = St(fit)/A t = {St(fio) + (fit - fio)v t + K(fit - fio) 2 Rt)/A t 
ah =(1 + fit) 2 at 2 . 

Step 3: If fit and at 2 belong to the feasible region, i.e., if (—fit of 2 
- ajt) > then fit = fit, of = at 2 , and a\ t = -&# - a% t . 

Step 4: If §t_ and at 2 do not belong to the feasible region, i.e., 
if (—fito* 2 — <r^) < 0, then set a\ t = 0. Compute & and a 2 by solving 
the following two equations: 

6l = -fit6 2 -ajt = 0, 

and 

Bh = (1 + fi t ) 2 6l 

The resulting fit and a 2 are given by 



t _ -(2 + al/al.) + V(2 + al/aj,) 2 - 4 
fit 2 

and 

a 2 = - 5/A. 

Note that when these truncations are applied there will be a larger 
degree of approximation involved in using eqs. (27) and (28) for 
computing the variances of fit and a 2 . However, these variances enter 
only into the secondary terms of the adaptive Kalman filter to be 
derived in the next section. Consequently, we may ignore the effect of 
truncation. 
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V. ADAPTIVE KALMAN FILTER 

In deriving the Kalman filter solution of Section III, it was assumed 
that a'i, and a\ t are known quantities. Since in the audit problem these 
parameters are estimated using the observed data, we need to make 
due modifications to the Kalman filter solution. We will refer to the 
resulting formulae as the adaptive Kalman filter. 

Consider the distribution of m t conditional on data up to time t: 

m, = E(m,\t) - EE(m,\t, a 2 ,, alt) = E(w 2 tm t -i + (1 - U2t)Y t ); 

hence, 

m, « <2>2tih,-i + (1 — <02t)Yt (30) 

where 

A 2 2 

a{ t + a\ t + ait + qt-\ 

The distribution of U2t conditional on data up to time t is very 
complicated. So the expected value of u 2t cannot be simply computed. 
Therefore, in eq. (30) we have approximated E(u>2t\t) by u>2t, the 
maximum posterior density point. This approximation would be very 
good when fit and a 2 he inside the feasible region. However, if fit and 
a 2 lie on the boundary of the feasible region, a>2< will be a less accurate 
approximation of E(u2t\t). The extent of the inaccuracy will depend 
on the values of R t and A t . For large R t and A t the likelihood of fit and 
a? will drop very rapidly as one goes away from the feasible boundary 
nearest to the maximum likelihood point. Thus, the inaccuracy would 
be smaller for larger values of R t and A t . 

Now consider the variance of m t given data up to time t: 

q t = V(m t \t) = EV(m t \t, alt, alt) + VE(m t \t, a\ t , alt) 

= E[(l - U2t)(al + ai)] + V[o32tmt-i + (1 - «»)*<]; 
hence, 

qt « (1 - £>2t)(a\t + a%) + (Yt - 4-i)V(« B ) . (32) 

The effect of fit and a 2 lying on the feasible boundary will be to 
introduce an inaccuracy in the term (1 — o)2t)(6 2 t + a%) of eq. (32), as 
discussed above. Knowing the variance of fit and aj, the variance of 
u 2 t can be derived via the Taylor series approximation as follows. We 
have 

_ 6\ t + a^ 

W 2f — To — ; 5 — ! — ^5 — i 

ait + ait + oh + qt-i 

—fitOt — a v t + a v t 



(l+fi,+ fi 2 )a 2 + q t -i + a\ t - a 2 t 
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d(^2t\ , n ft x ( d0)2t J , ., . 2 > 



Hence, to the first-order approximation the variance of <02* is 



= {2<j?[1 + (02,(1 + 2fi t )f/R t 

+ 2310, + (1 + A + fi)uuT/A t ) 

+ (6\t + o%+6l t + q t -x) 2 . (33) 

Note that the smoothing constant a>2< is restricted to the interval 
[0, 1]. The most noninformative distribution on this interval is the 
uniform distribution whose variance is V12. The £)2t computed by eq. 
(31) clearly adheres to the interval [0, 1]. However, because of the 
approximations involved, the computed V{u2t) may come out larger 
than M2. In that case we propose to truncate it to Va. 

When A and a 2 t lie on the boundary of their feasible region, the use 
of the Taylor series approximation would yield inaccurate estimates of 
V(u 2 t). Since the contribution of this variance is secondary in comput- 
ing V(m t 1 1), we may ignore the effect of truncation. 

We can proceed in an analogous way to compute E(£ t \t) = & and 
V(&\t)=p t to yield: 

it = U2t&um t -i + (1 - U2t">it)Yt, (34) 



where 



w « 2,-2 



(35) 



and 



where 



O n t + Ou 

p t = (1 - O2twu)o$t + (Y t — m t -i) 2 V{u2tU\t) , (36) 

V(«2f«u) * [2a?(l + 2$t) 2 <a\&/Rt 

+ 26U1 + A + $)&*&/&] 

+ (6 2 u + o 2 , t + 6 2 2t + q t - 1 ) 2 . (37) 

For the reasons discussed in the case of V(u 2 t), if eq. (37) yields a 
value of V(a>2<<oi/) > Via, then we will truncate it to V12. 

The effects of A and 0? lying on the feasible boundary are similar to 
those explained in connection with m t and q t . 

VI. BOX CHART AND THE EXCEPTION REPORT 
In Ref. 2 Bruce Hoadley has proposed a format for displaying the 
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conditional distribution of 6 t given data up to time t. He has also 
proposed exception rules in terms of this distribution. We shall use the 
same reporting format and the exception rules. 

The conditional distribution of B t will be summarized by a box chart 
that shows the 99, 95, 5, and 1 percentiles of the distribution, the best 
estimate of t , denoted by h, the mean level, M t , and the current 
defect index I t . By applying the inverse square root transformation, 
we have 

M t = m 2 and 6 t — f 2 . 

The quantiles of t are once again obtained by squaring the quan tiles 
of &. Since & is restricted to be positive, and we have approximated its 
density by the normal distribution, we may have to truncate some of 
the extreme quantiles to zero. If we take this fact into account, the 
desired quantiles of t are: 

Qi = 99% quantile = [max(£ - 2-326^, 0)] 2 

Q 2 = 95% quantile = [max(£ - 1-645 >fp t , 0)] 2 

Q 3 = 5% quantile = (f, + 1-645 v^) 2 

Q 4 = 1% quantile = (fc + 2-326 \/p,) 2 . 

A sample box chart is shown in Fig. 2. 

The exception rules are: 

(i) Below Normal: A rating class will be declared below normal if 
the posterior probability of & t being larger than one exceeds 0.99. 

(ii) Alert: An alert will be declared for a rating class if the posterior 
probability of 6 t being greater than one exceeds 0.95 but it is less than 
or equal to 0.99. 



2 - 









Q 4 1% QUANTILE 
Q 3 5% QUANTILE 

A 

--- e 

A 

M QUALITY STANDARD 
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Q 2 95% QUANTILE 
Q, 99% QUANTILE 




1 



Fig. 2 — Sample box chart of the conditional distribution of t . 
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So in terms of the quantiles derived above, we will declare below 
normal if Qi > 1 and alert if Q, < 1 but Q 2 > 1. 

VII. CHOICE OF STARTING VALUES 

The Kalman filter solution described in Section III and the estima- 
tion of model parameters described in Section IV are both recursive 
procedures that must be appropriately initialized. Note that since each 
of these procedures discounts the past data, the effect of initialization 
diminishes to zero as more data is accumulated on any rating class. So 
any biases introduced by the initialization process are transient and 
temporary. The best way to choose the initial values is by analyzing 
the historical data on all rating classes. Pending such an analysis, we 
shall tentatively choose the initial parameter values, as follows. 

We will take m = 1.0 and q = 0.134. This amounts to choosing a 
very diffused prior distribution on the mean level. On the square-root 
defect-index scale the lower and the upper one percentiles of this 
distribution are 0.149 and 1.851, respectively; while on the defect-index 
scale the lower and the upper one percentiles are 0.022 and 3.428, 
respectively. The mean and the median of this distribution are equal 
to one on either scale. Consistent with this we will choose Y = 1.0. 

The parameter a 2 , should be taken equal to the variance of the 
transformed defect index associated with the planned equivalent ex- 
pectancy for a period's sample for the particular rating class. In the 
present analysis we will take e = e\ and a\o = ai,, = 0.25/e . 

The parameter A determines how many periods of data are effec- 
tively used in estimating the time series parameters (S, and a 2 and, 
hence, the parameters o\ t and oh. We will take A = 0.95, which implies 
that effectively 1/(1 - A) = 20 periods of data are used in estimating 
the model parameters. 

We also need to specify the values of /So, Oo, dao/dfi, So(/?o), eo, Ro, 
and Aq . All these variables enter into the recursive maximum likelihood 
estimation of /? and a 2 . We shall take /? = -0.6, which is an approximate 
midpoint of the feasible range of /?. The quantities ao and dao/dfi will 
be taken equal to their respective expected values, namely, zero in 
each case; and Ao will be set equal to its steady-state value, namely, 
1/(1 - A). We will take v = 0, So(fio) = 0.625/[e (l - A)], and R = 

20.0/eo. 

The above starting values imply that at t = the mean and the 
variance of /? are respectively -0.6 and 0.063. The variance of the 
uniform distribution on the (-1, 0) interval is Via = 0.083. Since the 
feasible interval for fi is smaller than (-1, 0), the variance of 0.063 
represents a fairly diffused initialization. 

Also, the above starting values imply that the mean and the variance 
of a 2 at t = are 2.5 a 2 * and 0.625(0^ ) 2 . Therefore, by the gamma 
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density assumption, the 95-percent confidence interval on a 2 is (1.2 ojjo, 
4.28 a%), which is a very wide interval. 

The values of a 2 and a 2 implied by the above starting values are 
0.5 a^o and 0.4 a^, respectively. 

VIII. ILLUSTRATIVE EXAMPLES 

To illustrate the properties of the quality evaluation plan we shall 
now present six examples. The first three examples are the simulated 
examples, while the latter three use real audit data. 

Example 1: Figure 3a shows the response of qep to a sudden shift 
in the quality level. For the first ten periods the observed defect index 
fluctuates randomly around the fixed level 3.0. From the eleventh to 
the twentieth period the observed defect index is fixed at 1.0. In each 
period the expectancy at standard is 5.0. Notice that starting with the 
eleventh period the estimated mean level rapidly approaches the new 
mean level. Also, starting with the eleventh observation the product 
gets off the exception report. Figure 3b shows the corresponding results 
for qmp. It is clear that in terms of both $[ t and % the response of qep 
is quicker than the response of qmp. 

Example 2: Figure 4a displays the response of qep to a linear trend 
in the quality level. As in the case of Example 1, for the first ten 
periods the observed defect index randomly fluctuates around the fixed 
level 3.0. From the eleventh to the twentieth period the observed 
defect index has a linear downtrend, as plotted. In all twenty periods 
the expectancy at standard is 5.0. Notice that both M t and t follow 
the trend with a small lag. Also note that the qep algorithm recognizes 
that the process has a drift rather than random fluctuation. Conse- 
quently, M t and % are very close while following the drift. Figure 4b 
shows the results of qmp for the same data. Here again, M t and % 
follow the trend, but the lag is much larger. This is manifested in the 
fact that qep gets the product off below normal in the seventeenth 
period while with qmp that happens a period later. 

Example 3: This example illustrates that qep and qmp have similar 
behaviors when the defect index fluctuates about a fixed value for a 
long period of time. Figure 5a shows the results of qep when the defect 
index fluctuates around the fixed level of 2.0, while Fig. 5b shows the 
results of qmp with the same data. Note that both the methods declare 
below normals and alerts in the same periods. 

Example 4: Figure 6a gives the data for repaired remreed grids, 
rating class OC038TT, for periods 7801 through 7904. The periods are 
numbered 1 through 12 in the figure. The qep results are also shown 
in the figure. Similar results with qmp are plotted in Fig. 6b. In 
response to the drift in the quality we see that qep attaches a heavier 
weight to the current data. Consequently, with qep the mean level, Mt 
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Fig. 3— Response to a sudden shift in the quality level for: (a) qep, and (b) qmp. 

follows the drift more closely than with qmp. In the period 5, recogniz- 
ing the drift, qep takes the product off the exception report while qmp 
still calls it an alert. Also period 7 is an alert according to qmp while, 
according to qep, it is off the exception report. These differences 
between qep and qmp are clearly seen to be the result of the fact that 
qep exponentially discounts the past data, while qmp considers every 
observation in the six-period window to be equally important. 

Example 5: Figures 7a and 7b give the results of qep and qmp, 
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Fig. 4 — Response to a linear trend in the quality level for: (a) qep, and (b) qmp. 



respectively, for the rating class OH060CM, consisting of modular 
telephone chords. The periods covered by the chart are 7701-7808. As 
we saw in Example 4, qep follows the drift more closely. In terms of 
the exception report, there are several differences. In periods 8, 15, and 
16 qmp declares below normal, while qep calls it only an alert. In 
period 10 qmp calls it an alert while qep does not declare any exception. 
These differences are once again a result of the fact that qep recognizes 
the drift and hence heavily discounts the past data. 
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Fig. 5 — Response to a random fluctuation in the quality level for: (a) qep, and (b) 
BMP. 



Example 6: The last example to be considered is the rating class 
MV104MJ. The results of both the methods are shown in Fig. 8. In 
this example the quality fluctuates more or less randomly about a 
fixed mean and, as expected, the two methods give comparable results. 

The average values of the weights iou and u>2t, and the equivalent 
expectancies e, for the three audit examples are tabulated in Table I. 
Notice that average value of u 2 t for OC038TT and OH060CM is 0.48 
compared with 0.55 for MV104MJ. This is a direct consequence of the 
fact that MV104MJ does not exhibit a drift while the others do. The 
high-frequency fluctuation about the mean function M t is depicted by 
uu. Relative to the sampling variance (0.25/e,) OC038TT exhibits a 
smaller fluctuation than OH060CM. This concurs with the average 
values of u u for these rating classes. 
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Fig. 6— Results for the rating class OC038TT, periods 7801 through 7904 for (a) qep, 
«d(b) 



and (b) qmp. 



Through the preceding examples it is quite apparent that qep and 
qmp could give somewhat different results. Now the key question is: 
Which method yields a more precise estimate of the unobserved "true 
defect index"? The only way to answer this question decisively is to 
take a 100-percent sample of a number of rating classes to find out the 
true defect indices and compare them with the qep and qmp results. 
This is obviously an impossible task. A feasible way to answer the 
question is by using the models to predict one step ahead and compare 
the mean-squared prediction errors. Note thatM,-i is a predictor of//. 
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Fig. 7— Results for the rating class OH060CM, periods 7701 through 7808 for: (a) qep, 
and (b) qmp. 



The mean-squared errors for the three audit data examples, viz. 
Examples 4, 5, and 6, are given in Table II. For the rating class 
OC038TT we notice that the mean-squared prediction error (m.s.p.e.) 
of qmp is 33 percent larger than that of qep; for OH060CM the m.s.p.e. 
of qmp is 11 percent larger, and for MV104MJ the m.s.p.e. of qmp is 
only 3 percent larger. Thus, whenever there is a drift in the quality we 
may expect qep to perform better than qmp, whereas if the quality 
fluctuates randomly around a fixed mean, both qmp and qep would 
give similar results. 

Effect of truncation: In addition to the numerical examples cited 
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Fig. 8— Results for the rating class MV104MJ, periods 7701 through 7808 for: (a) qep, 
and (b) qmp. 



above, a limited numerical study was made with thirteen representa- 
tive rating classes. Each rating class had about fourteen periods of 
data. This represents a total of 182 test periods. Among these examples, 
truncation occurred in only 7 percent of the periods. Except for one 
case, all the truncations caused a\ t = 0. These cases of truncation could 
be recognized broadly as situations where the variance of the observed 
defect indices was much smaller than that for the Poisson distribution. 
In each case of truncation the confidence intervals computed by qep 
looked reasonable and comparable to those obtained by qmp, so we 
can tentatively conclude that the effect of truncation is negligible. Of 
course, an extensive trial of qep may suggest some modifications to 
the truncation rules. 

One such modification may be to view the likelihood function of fit 
and a 2 as the posterior-probability density function. Then the Bayes 
estimates of fit and a 2 may be used in place of the maximum likelihood 
estimates used in this paper. Because of the complexity of the feasible 
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Table I — Computed qep weights for the 
examples 



Average Value of 


Rating Class 


Uii 


W2( e< 


OC038TT 
OH060CM 
MV104MJ 


0.83 
0.67 
0.75 


0.48 3.7 
0.48 7.9 
0.55 4.6 


Table II — Comparison of the mean- 
squared prediction error 




I 


vlean Squared Predic- 
tion Error 


Rating Class 


QEP QMP 


OC038TT 
OH060CM 
MV104MJ 




0.69 0.92 
0.94 1.04 
0.29 0.30 



region, computing Bayes estimates would involve extensive numerical 
effort, which may be unnecessary. 

IX. DISCUSSIONS 

In summary, the qep model consists of two parts — the system model 
and the observation model. The system model states that the trans- 
formed true-defect index is equal to the process mean that follows the 
random walk model plus process fluctuation, which is statistically 
independent from period to period. The random walk model takes care 
of the process drift. The observation model states that the transformed 
observed defect index is equal to the transformed true-defect index 
plus sampling error with a known variance. The different parameters 
of the qep model are estimated from the observed data by the recur- 
sive, exponentially discounted, maximum likelihood method. The suc- 
cessive transformed true defect indices and the process mean levels 
are then estimated by the adaptive Kalman filtering algorithm. 

From the derivation of the plan and the illustrative examples the 
following advantages of qep are apparent: 

(i) The qep model takes into account the time order of the 
observations, while in qmp the time order of the observations is 
ignored. 

(ii) The best estimate of the process mean level is obtained by an 
adaptive exponential smoothing procedure. This makes qep more 
responsive to the shifts and drifts in the process level. This is evidenced 
by the lower mean-squared prediction error for the examples discussed 
in Section VIII. 
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(Hi) The qmp model is a special case of the qep model. However, 
the two algorithms are quite different. 

(iu) The computations are recursive. The entire past data are 
summarized by ten numbers. 

(v) The computational efforts of qep and qmp algorithms are 
comparable. 

In the light of the advantages listed above it is proposed that qep be 
considered as a serious alternative to qmp for official rating. In prep- 
aration for using qep it is suggested that it be tried on all rating classes 
for a number of rating periods, and the resulting exception reports 
carefully compared with those from the qmp and the £-rate system. 
Such a study would aid us in fine tuning the starting conditions, 
quantifying the effect of truncation, and perhaps in making some other 
minor modifications for improving the performance of the qep. 

For small expectancies, the square root transform of the Poisson 
distribution has a significantly different variance than 0.25, assumed 
in Section II. Since the audit samples can at times be very small it 
would be necessary to use the correct variances. A study of this aspect 
will be done in a later memorandum. 

The adaptive Kalman filtering methodology derived in this paper, 
with appropriate extensions and modifications, can be put to many 
other applications. In the field of quality control, Phadke 7 had devel- 
oped a sequential empirical Bayes acceptance sampling plan. The 
adaptive Kaman filtering method developed in this paper would be 
particularly suited for updating the empirical prior distribution. An- 
other potential application is in combining the traditional X and R 
control charts into a single box chart. Here the adaptive Kalman filter 
would permit one to take into account serial correlation in the data as 
well as process drifts and shifts, and changes in the process variance. 
Yet another application is in adaptive time series forecasting. 
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APPENDIX A 

Derivation of the Kalman filter solution 

Let the conditional distribution of m t -i given data up to time t - 1 
be normal with mean m,-i and variance q t -i, i.e., 

m^t-l-Nimt-uQt-i). (38) 

Eq. (2) expresses m, as a sum of two independent normal random 
variables m,-\ and v 2l . Since the mean and variance of v 2l are respec- 
tively and ah, it follows that 

mt\t-l~N{m t -i,oh + qt-x). (39) 

Substituting eq. (1) in eq. (3) we have 

Y, = m t + Pu + yt, (40) 

which implies that 

Yt\mt~N{m t ,o\ t +o\ t ). (41) 

In the Bayesian framework we may view eq. (39) as a prior distri- 
bution on m,, and Y, as an observation of m, with the distribution 
specified by eq. (41). Applying the Bayes theorem the distribution of 
m, conditional on data up to time t is seen to be 

M , v f (m,-m,_i) 2 (Y,-m,) 2 

Am^)oce X p|- 2(<4 + ^ i) - 2(<A+ ^ ) 

^J.^LZ^l), (42) 



2q f 



where 



and 



m, = U2 t m,-i + (1 — U2t)Y t , (4) 

U2t = (oit + al,)/{a\, + a\, + a\ + q,-i) , (5) 

q t = (I - U2t){ol + o%) . (6) 

From eq. (42) it can be inferred that the distribution of m, conditional 
on data up to time t is normal with mean m, and variance q t . 

Equations (7) through (9), used for computing the conditional dis- 
tribution of f,, can be derived analogously as follows. First by substi- 
tuting eq. (2) in (1) we have 
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& = mt-i + vu + v 2t ; (43) 

hence, 

&| t - 1 ~ 2V(m,-i , a?, + ai, + g,-,) . (44) 

From eq. (3) we have 

Yt\$t~N(U t o%). (45) 

Treating eq. (44) as a prior distribution for & and applying the Bayes 
theorem, we readily obtain the distribution of £, conditional on data 
up to time t as 

(£-m,_i) 2 (Y t - 



f(S,\t) ccexp - 



W<) 2 1 
2a 2 , J 



2(a?, + a|, + q t -i) 
ocexp^ J, (46) 



(fr ~ fc) 2 ] 
2p, J' 



where 



and 



t = (01,(02(^,-1 + (1 — UilU2l)Y f , (7) 

uufafe = aj//(crjf + ah + alt + Qt-i), (8) 

/>, = (1 - o)i,co 2 ,)<t5<- (9) 

Thus, the conditional distribution of f, on data up to time t is normal 
with mean \, and variance p t . Equations (7) through (9) form the 
desired recursive equations for computing It and p t . 

APPENDIX B 
Estimation of fi and a 2 

Given the observed transformed defect indices Y , Yi, Y2, • • • , Y„ 
one can compute Z t = Y t — Y t -\ for t = 1, • • • , it. The Z t series follows 
MA(1) model given by eq. (13). The exponentially discounted proba- 
bility density function of a.\ , • • • , a t is given by 

-iA. 

(47) 



t 
p(a u ••• ,a,\a 2 ) = JJ 



exp(-a5/2a 2 ) 



V2~ 2 



27ra 



where A, = X'~ J . Thus, the exponentially discounted probability density 
function of Z\ , • • • , Z, conditional on the knowledge of ao, /?, and a 2 is 
given by 

-lA. 



7=1 



vS 



exp(-a 2 /2cr 2 ) 



(48) 
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where a> is related to Z, and /? via the recursion relation in eq. (13), i.e., 

a J = Z J -pa j - i . (49) 

Thus, the conditional, exponentially discounted, log-likelihood func- 



tion is 



where 



and 



L t (P, a 2 ) = -K{A, In a 2 + S,/a 2 ), 



A, = I X 



t-j 



S t - S V" y o|. 



(50) 



(51) 



(52) 



/-] 



By differentiating eq. (50) with respect to /? and a 2 and equating the 
derivatives to zero, it can be shown that L t is maximum at (fit, a 2 ,), 
where % is the minimum point of S f (P) and a 2 = S t (ft t )/Ai . 

In the neighborhood of a point /? , we can approximate a, by the 
linear function: 



(54) 



Substituting this approximation in eq. (52) we have 

S t (P) « S,(/3 Q ) + {P- Po)v< + l MP - p ) 2 Rt, 

where v h R, and S t {p ) obey the recursion relations shown in eqs. (20) 
through (26). It is easy to verify that p,, given by eq. (18) minimizes 
the approximate S,(P) of eq. (54), and eq. (19) gives a 2 . 
The matrix of second partial derivatives of L, is 



r r, 









2a' 






A, 


o 






2a 4 ] 



so by the Fisher-information theory, the estimates /?, and a 2 , are 
uncorrected and their approximate variances are as given by eqs. (27) 
and (28). 

The above recursive procedure also has a Bayesian interpretation, 
as given by Phadke. 6 

APPENDIX C 

Summary of the Formulae 

C. 1 Initial conditions 

rho= 1.0 
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q = 0.134 
Yo - 1.0 
/So = -0.6 

A = 0.95 
dao 

0.625 



So(/0o) - 



e (l - A) 



p = 

flo = 20.0/eo 
A = 1/(1 -A) 

ojo = 0.25/eo 

C.2 Recursive formulae 

It = x,/e t 

Y t = >fi t 

Z t = Y t - Y t -x 

a 2 n ,t = 0.25/e, 

a t = Z t — f3oa t -\ 

da t da t -\ 

— - = - a t -i - [So— — 
dp dp 

St(fa) = AS,-,(/?„) + a 2 

x « da t 

v t = Kv t -\ + 2a, -— 

dp 

*< =A/e <- + 2 (|f 

A, = XAt-i + 1 

fit = ft, - R7 x v,, -l<>ff? <0 

s t (p?) = s,(/3 ) + (/?,* - p )v, + y 2 ( / gr - p ) 2 R t 

a? 2 = S t (p*t)/A t 
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"5 = A ^ + (1 - X)oii 
<ft = _ i g* a *2 _ ^ a?, > 

& - (1 + /??) V 2 
If- /?,*a,* 2 - "3 > 0, then 

fa = Pf&b 2 t = of. 
If- /3fof-~oJn <0, then 

B -(2 + &/?,) + V(2 + a 2 V4) 2 - 4 
ft" ~~2 

a? = -~a\,ifa. 



and 



u 2 t = 



*2 I _2 

a it + o v t 



ah + a^ + alt + qt-i 



rh t = u>2tfn t -\ + (1 — t02<) Yt 
It = Uufht + (1 - (Ok) y< 

{2o?[l + wa(l + 2fa)f/Rt} 
... + {26l[fa + (1 + A + A 2 )cQ2/] 2 M f ) , 1 

(af, + a 2 , + a%t + q t -i) 1* 

[261(1 + 2fa) 2 o&wh/Rt] 

+ [26 4 t(l + fa + $)" 2 iA/At\ . < J_ 

(fflt + (T2< + o$ t + gv-i) 12 

q t = (l - W2t)(a!/ + a%) + (Y t - m t _i) 2 V(« 2 t) 

p, = (1 - u u «»)oJf + (Yi _ mt-i) 2 V(iouw 2t ) 



C.3 Points for the box chart 
Current defect index 
Best estimate of the defect index 
The mean level 
99% quantile 
95% quantile 
5% quantile 
1% quantile 



It 



-fi 



M, = m? 

Qi = [max(f, - 2.326 y/p ti 0)] 2 

Q 2 = [max(f, - 1.645 Vp,, 0)] 2 

Q 3 = (£ + 1.645 v^) 2 

Q 4 = (ft + 2.326 v/7,) 2 
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